Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture

Authors

  • Joseph Gebis
  • Leonid Oliker
  • John Shalf
  • Samuel Williams
  • Katherine A. Yelick
Abstract

The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software-controlled scratchpad memories, such as the Cell local store, attempt to ameliorate this discrepancy by enabling precise control over memory movement; however, scratchpad technology confronts the programmer and compiler with an unfamiliar and difficult programming model. In this work, we present the Virtual Vector Architecture (ViVA), which combines the memory semantics of vector computers with a software-controlled scratchpad memory in order to provide a more effective and practical approach to latency hiding. ViVA requires minimal changes to the core design and could thus be easily integrated with conventional processor cores. To validate our approach, we implemented ViVA on the Mambo cycle-accurate full-system simulator, which was carefully calibrated to match the performance of our underlying PowerPC Apple G5 architecture. Results show that ViVA delivers significant performance benefits over scalar techniques for a variety of memory access patterns as well as for two important memory-bound compact kernels, corner turn and sparse matrix-vector multiplication, achieving a 2x–13x improvement compared to the scalar version. Overall, our preliminary ViVA exploration points to a promising approach for improving application performance on leading microprocessors with minimal design and complexity costs, in a power-efficient manner.
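The abstract names corner turn and sparse matrix-vector multiplication as the memory-bound kernels but does not show them. As a rough illustration only, the sketch below gives generic scalar versions of both in Python. It is not the authors' code and does not model ViVA or the Mambo simulator; the function names `corner_turn` and `spmv_csr` and the choice of the CSR sparse format are assumptions made for this example.

```python
def corner_turn(matrix):
    """Transpose a dense row-major matrix (the 'corner turn' kernel)."""
    rows, cols = len(matrix), len(matrix[0])
    out = [[0] * rows for _ in range(cols)]
    for i in range(rows):
        for j in range(cols):
            # Reads walk along a source row; writes jump between rows of
            # `out`, which is the strided pattern that stresses memory.
            out[j][i] = matrix[i][j]
    return out


def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A * x for a sparse matrix A stored in CSR form.

    CSR is an assumed format for this sketch:
      values  - nonzero entries, row by row
      col_idx - column index of each nonzero
      row_ptr - start offset of each row in `values` (length = rows + 1)
    """
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is read through col_idx: an indexed (gather-style) access.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Small usage example: A = [[1, 0, 2], [0, 3, 0]]
transposed = corner_turn([[1, 2, 3], [4, 5, 6]])
product = spmv_csr([1.0, 2.0, 3.0], [0, 2, 1], [0, 2, 3], [1.0, 2.0, 3.0])
```

Both loops do very little arithmetic per element loaded, which is why kernels of this kind tend to be limited by memory latency rather than by compute.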

Similar resources

FlashVM: Virtual Memory Management on Flash

With the decreasing price of flash memory, systems will increasingly use solid-state storage for virtual-memory paging rather than disks. FlashVM is a system architecture and a core virtual memory subsystem built in the Linux kernel that uses dedicated flash for paging. FlashVM focuses on three major design goals for memory management on flash: high performance, reduced flash wear out for impro...

An Evaluation of cJava System Architecture

In this work, we propose a new distributed run-time environment, which we call cJava, that enables multithreaded Java applications to execute in clusters transparently. Our implementation of cJava supports distributed shared memory (DSM), which required significant extensions to the original Java virtual machine (JVM). First, a distributed object manager was incorporated into the JVM’s memory ...

A Framework for Memory Subsystem Exploration

Memory represents a major bottleneck in modern embedded systems in terms of cost, power and performance. Traditionally, memory organizations for programmable systems assume a fixed cache hierarchy. With the widening processor-memory gap, more aggressive memory technologies and organizations have appeared, allowing customization of a heterogeneous memory architecture tuned for the application. H...

The Impact on Latency and Bandwidth for a Distributed Shared Memory System Using a Gigabit Network Supporting the Virtual Interface Architecture

Previous studies have shown significant performance advantages in using Virtual Interface Architecture (VIA) instead of TCP/IP for handling network communication in the structured distributed shared memory system, PastSet. With the availability of network hardware that supports VIA, we wish to examine whether, and to what extent, an available hardware supported VIA implementation outperforms th...

An Analysis of Disk Performance in VMware ESX Server Virtual Machines

The performance of applications running within VMs is a significant factor in their adoption. VMware ESX Server was designed for high performance, and its architecture is streamlined to provide high-speed I/O. In this paper, we focus on one component of ESX Server's I/O architecture, its storage subsystem. We look at the characteristics of a series of disk microbenchmarks on several different s...

Publication date: 2009